Abstract



We show that for sets with the Hausdorff-Besicovitch dimension equal to zero,
the box counting algorithm commonly used to calculate Renyi exponents ($d_q$)
can exhibit perfect scaling, suggesting nonzero $d_q$'s. Properties of these
pathological sets (pseudofractals) are investigated. Numerical as well as
analytical estimates for the $d_q$'s are obtained. A simple indicator is given to
distinguish pseudofractals from fractals in practical applications of the box
counting method. Histograms made of pseudofractal sets are shown to have Pareto
tails.

I. INTRODUCTION 

The notion of a fractal was introduced in the 1970s by B. Mandelbrot and it soon became
very fashionable. In the mathematical sense, a set is called a fractal (set) when its
Hausdorff-Besicovitch dimension ($d_{HB}$) is greater than its topological dimension
($d_T$) [?]. Since fractality is strictly related to the physically important self
similarity (self affinity), scaling symmetries and the renormalization group, it is widely
used in physics on all scales, ranging from particle physics [?] to astrophysics [?], and
in various other areas, like solid state physics [?] or econophysics [?].

However, in contrast to fractal sets constructed by mathematicians, like the famous triadic
Cantor set (1883), for physically interesting cases the algorithms to construct the
corresponding data sets are usually unknown, and it is very difficult (or just impossible) to
calculate their Hausdorff-Besicovitch dimension, Renyi exponents etc. in a mathematically
rigorous way. Instead, one considers a zero dimensional (finite number) subset of data points
and applies a standard numerical algorithm, like the box counting (BC) algorithm or its
derivatives, that gives the well known log-log plot. A good linear fit is assumed to be
equivalent to the calculation of the corresponding fractal dimensions. Apart from the fact
that the above fit is to some extent arbitrary (see e.g. [?]) and there is no good method to
calculate "error bars", it will be shown in the following sections that even a perfect fit
can be misleading.

In fact, there are many different mathematical definitions of the fractal (capacity)
dimension that can give different results when applied to a given fractal set. They were
originally introduced to physics to characterize strange attractors of dynamical systems [?].
The BC capacity dimension is closest to the dimension introduced by Kolmogorov [?].

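Here, we limit our discussion to the BC method, which is the basic paradigm for practical
computation of the generalized Renyi exponents, $d_q$, defined by [?]

$$ d_q = \frac{1}{1-q} \lim_{N\to\infty} \frac{\ln \sum_{i=1}^{N} p_i^q(N)}{\ln N} \equiv \lim_{N\to\infty} \frac{\ln Y(N)}{\ln N} , \qquad (1) $$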

where $N$ is the total number of "boxes" (bins) and $p_i$ is the part of the "mass" (i.e. the
fraction of all points) contained in the $i$-th box. Also, in this paper we deal with sets that
are not fractals, namely the discrete (point) sets. For these sets one defines

$$ p_i(N) = \frac{n_i(N)}{n_{tot}} , \qquad (2) $$

where $n_i(N)$ is the number of data points ("mass") in the $i$-th box for a given subdivision
(partition) $N$ and $n_{tot}$ is the total number of data points ("mass") contained in all boxes.
In the case $q = 0$ (capacity dimension) eq. (1) becomes

$$ d_0 = \lim_{N\to\infty} \frac{\ln M(N)}{\ln N} , \qquad (3) $$

where $M(N)$ denotes just the number of non empty boxes. In this case the number of data
points in particular boxes is irrelevant, and this singles out the value $q = 0$. This is the
reason why the BC method gives a unique result for $d_0$ (see Sec. II). $d_q$ is determined from
the log-log plot of $\log Y(N)$ vs. $\log N$ with $N = 2^0, \ldots, 2^k$, usually with $k$ between
10 and 30.
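
As an illustration of the box counting procedure behind eqs. (1)-(3), the following Python
sketch computes the occupation fractions $p_i(N)$ and the quantity $Y(N)$ for points on the unit
interval. The function names, the convention $Y(N) = (\sum_i p_i^q)^{1/(1-q)}$ (so that $Y = M$ for
$q = 0$) and the placement of the point $x = 1$ in the last box are assumptions made for this
sketch; they are not specified in the text.

```python
from collections import Counter


def box_fractions(points, n_boxes):
    """Return the occupation fractions p_i = n_i / n_tot of the non-empty boxes.

    Points are assumed to lie in [0, 1]; x = 1 is placed in the last box.
    """
    counts = Counter(min(int(x * n_boxes), n_boxes - 1) for x in points)
    n_tot = len(points)
    return [n_i / n_tot for n_i in counts.values()]


def renyi_Y(points, n_boxes, q):
    """Y(N) = (sum_i p_i^q)^(1/(1-q)); for q = 0 this is the number M(N).

    The case q = 1 needs a separate limit and is not handled here.
    """
    if q == 1:
        raise ValueError("q = 1 requires a separate limiting procedure")
    fractions = box_fractions(points, n_boxes)
    return sum(p ** q for p in fractions) ** (1.0 / (1.0 - q))


# Four points and two boxes: each box holds half of the "mass".
sample = [0.0, 0.1, 0.6, 0.9]
print(renyi_Y(sample, 2, 0))    # number of non-empty boxes
print(renyi_Y(sample, 2, -1))
```

The exponent $d_q$ would then be read off as the slope of $\log Y(N)$ against $\log N$ over a range
of partitions $N = 2^k$.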

Since in practical computations with the BC and derivative methods one always deals
with a finite number of data points, we limit our analysis to discrete sets. In the following
Section we obtain an analytic expression for $d_0$ in the case of the set defined by [11, ?]

$$ x_n = \frac{1}{n^a} , \qquad n = 1, 2, \ldots , \quad a > 0 . \qquad (4) $$



The same method is also applied to general discrete sets with an accumulation point, as
well as to divergent series. In Sec. III the BC algorithm is applied to calculate the Renyi
exponents with $q \neq 0$ for (4). Excellent scaling (linear fit) is found, in full
agreement with analytical estimates, in spite of the fact that the set (4) is not a fractal and
has the Hausdorff-Besicovitch dimension equal to zero. Also, it is shown that the standard
BC method leads to a violation of the Hentschel-Procaccia (HP) inequality [10]. A modification
of the standard BC method which preserves the HP inequality is analyzed as well. Our
results can be generalized to sets with an arbitrary number of accumulation points. In Sec.
IV it is shown that pseudofractals generate histograms with fat tails, in contrast to fractals.
The final discussion is given in the last Section.

II. CAPACITY DIMENSION OF PSEUDOFRACTALS 

Clearly, the discrete and countable set (4) is not a fractal and it has zero
Hausdorff-Besicovitch dimension. However, as has been demonstrated using the dimension
function [11] or by direct application of the BC method [?], numerical computation must give
the following analytic result

$$ d_0 = \frac{1}{1+a} . \qquad (5) $$



As the method of analytical estimates of [?] and its generalization will be used throughout
the paper, we now describe it briefly. Assuming unit size of the whole set, the size
of a single bin is $1/N$. Denoting by $N_{sngl}$ the number of bins (and by $n_{sngl} = N_{sngl}$ the
number of corresponding data points) with one and only one data point inside, one can easily
calculate from (4)

$$ n_{sngl} = N_{sngl} \sim N^{\frac{1}{1+a}} . \qquad (6) $$

Since we have a logarithm in (1) and (3), and the limit $N \to \infty$, the constant pre-factor can
be neglected. The remaining data points ($n_r$) are closer to each other than the bin size.
Hence, all the bins covering them are non empty. The number of such bins ($N_r < n_r$) is equal
to the distance of the point $x_{n_{sngl}}$ from the accumulation point ($x_\infty = 0$) divided by
the bin size ($1/N$). This gives the estimate

$$ N_r \simeq N \, x_{n_{sngl}} \sim N^{\frac{1}{1+a}} , \qquad (7) $$

and in the limit $N \to \infty$

$$ M(N) \simeq N_{sngl} + N_r \sim N^{\frac{1}{1+a}} + N^{\frac{1}{1+a}} \sim N^{\frac{1}{1+a}} , \qquad (8) $$

which implies the result (5).
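
The scaling of the number of non empty bins can be checked with a short numerical experiment.
The sketch below counts occupied boxes for $x_n = 1/n^a$ at $N = 2^k$ and fits the slope of
$\log_2 M(N)$ against $\log_2 N$ by ordinary least squares. The sample size ($10^4$ points), the
range $k = 2, \ldots, 12$ and all function names are choices made for this illustration only.

```python
import math


def occupied_boxes(points, n_boxes):
    """M(N): number of boxes of size 1/N containing at least one point."""
    return len({min(int(x * n_boxes), n_boxes - 1) for x in points})


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def estimate_d0(a, n_tot=10_000, k_min=2, k_max=12):
    """Slope of log2 M(N) versus log2 N for the set x_n = 1/n^a."""
    points = [n ** (-a) for n in range(1, n_tot + 1)]
    ks = list(range(k_min, k_max + 1))
    log_m = [math.log2(occupied_boxes(points, 2 ** k)) for k in ks]
    return fit_slope(ks, log_m)


for a in (0.5, 1.0, 2.0):
    print(f"a = {a}: fitted d0 = {estimate_d0(a):.3f}, 1/(1+a) = {1 / (1 + a):.3f}")
```

The fitted values depend somewhat on the chosen range of $k$, since the partition must stay
coarse enough for the finite data set.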

From the above proof it is clear that the exponent $d_0$ depends on the rate of change
of the distances between neighboring points ("level spacing") with respect to the length of
the whole interval or, in other words, on the speed of convergence of data points to the
accumulation point. This enables us to generalize the above result. Also, one can consider
divergent sets ($x_n \to \infty$ for $n \to \infty$) by rescaling them to the unit interval. To this
end, we define the convergence rate $\Delta x(n)$ by

$$ \Delta x(n) = \begin{cases} |x_n - x_{n+1}| & \text{for } |x_\infty| < \infty , \\ |x_n - x_{n+1}| / |x_n| & \text{for } |x_\infty| = \infty . \end{cases} \qquad (9) $$

This gives the following general formula for the exponent $d_0$

$$ d_0 = \lim_{n\to\infty} \frac{-\ln n}{\ln \Delta x(n)} . \qquad (10) $$

In particular, for (4) one gets $\Delta x(n) \sim 1/n^{a+1}$, which leads to formula (5). For slowly
converging series, like $1/\ln n$, one has $d_0 = 1$, while for strong convergence (e.g.
$x_n = e^{-n}$) there is $d_0 = 0$. On the other hand, for all diverging series (like $n^a$, $e^{n}$
or $\ln n$) there is always $d_0 = 1$. Intuitively, slowly converging series look uniformly
distributed, while exponentially converging ones look concentrated at the accumulation point
(zero dimensional). Hence, from this point of view, the series with inverse power asymptotics
are the only non trivial ones.
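
The general formula can be explored at finite $n$ with the sketch below, which evaluates
$-\ln n / \ln \Delta x(n)$ for a user supplied series. It assumes that the convergence rate is the
level spacing $|x_n - x_{n+1}|$, divided by $|x_n|$ for divergent series; the function name and the
`divergent` flag are hypothetical. Since the limit is approached only logarithmically, the
finite-$n$ values can differ visibly from the limiting exponent, most strongly for slowly
converging series such as $1/\ln n$.

```python
import math


def d0_from_rate(x, n, divergent=False):
    """Finite-n value of -ln(n) / ln(Delta x(n)).

    x is a function n -> x_n. For divergent series the spacing is divided
    by |x_n|, which corresponds to rescaling to the unit interval.
    """
    spacing = abs(x(n) - x(n + 1))
    if divergent:
        spacing /= abs(x(n))
    return -math.log(n) / math.log(spacing)


print(d0_from_rate(lambda n: 1.0 / n, 10**6))            # power law, a = 1
print(d0_from_rate(lambda n: n ** -2.0, 10**6))          # power law, a = 2
print(d0_from_rate(lambda n: math.exp(-n), 500))         # exponential convergence
print(d0_from_rate(lambda n: 1.0 / math.log(n), 10**6))  # slow convergence
print(d0_from_rate(lambda n: float(n) ** 2, 10**6, divergent=True))
```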



The above results can be verified numerically by applying the BC method. The results
are displayed in Fig. 1, where straight lines correspond to the theoretical predictions.
Actually, already for $10^4$ data points one can see excellent linear scaling in the log-log plot
over more than a dozen binary orders of magnitude, well above what is usually
demanded in practical applications. In addition, the results are in perfect agreement with
formula (10): $d_0 = 0.50$, $0.66$ and $0.33$ for $a = 1$, $0.5$ and $2$, respectively, while for the
divergent series $\sqrt{n}$ and $n^2$ (crosses and circles) one obtains $d_0 = 1.0$.

III. PSEUDOFRACTALS AND GENERALIZED RENYI EXPONENTS 

For $q \neq 0$ analytical estimates are ambiguous, as we have to deal with the double limit
$\lim_{N\to\infty} \lim_{n_{tot}\to\infty}$, because the probabilities ($p_i = p_i(n_i, n_{tot}, N)$)
depend on both $N$ and $n_{tot}$. Equivalently, the measure (2) is not well defined. In standard
applications of the BC method one has a fixed number of data points ($n_{tot} = \mathrm{const.}$) and
the large $N$ limit is estimated. In this case, for $q < 0$ one can estimate the sum in (1) as in
the derivation of (8), by taking the partial sum over bins containing only one data point. Namely,

$$ \frac{1}{1-q} \ln \sum_{i=1}^{N_{sngl}} p_i^q = \frac{1}{1-q} \ln \left[ N^{\frac{1}{1+a}} \left( \frac{1}{n_{tot}} \right)^{q} \right] = \mathrm{const} + \frac{1}{1-q} \, \frac{1}{1+a} \ln N . $$

The upper limit can be estimated assuming an equal number of data points in the remaining bins
($N_r \sim N^{\frac{1}{1+a}}$). One should remember that, due to the limited number of data points,
the number of bins cannot be too large: $1 \ll N < n_{tot}^{1+a}$. For finer partitions we reach the
saturation point: there is a constant number of non empty bins with exactly one data
point inside, which corresponds to the value $\log Y_{max} = \log n_{tot}$ (see Figs. 1 and 2(A),
where $\log_2 Y_{max} = \log_2 10^4 \simeq 13.3$). Finally, we obtain an analytical estimate for large $N$

$$ d_q \simeq \frac{1}{1-q} \, \frac{1}{1+a} \qquad (q < 0) . \qquad (11) $$

For $q > 0$ estimates become more complicated, as truncation of the sum can make it smaller
than one, which changes the sign of the logarithm. However, for large $q$ one gets fast
convergence $d_q \to 1$ ($q \to +\infty$). Again, as is clear from Fig. 2(A), we obtain very good
linear fits over about ten binary orders of magnitude, which is usually interpreted as a
sign of fractality, and excellent agreement with the theoretical estimate.
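
A rough numerical check of the $q < 0$ scaling for the standard BC method (fixed $n_{tot}$) can be
sketched as follows. The comparison value $1/[(1-q)(1+a)]$ is the slope suggested by the
single-occupancy estimate above and is treated here as an assumption; the function names, the
$10^4$ points and the range $k = 2, \ldots, 12$ are likewise illustrative choices.

```python
import math
from collections import Counter


def renyi_Y(points, n_boxes, q):
    """Y(N) = (sum_i p_i^q)^(1/(1-q)) with fixed n_tot (standard BC method)."""
    counts = Counter(min(int(x * n_boxes), n_boxes - 1) for x in points)
    n_tot = len(points)
    total = sum((n_i / n_tot) ** q for n_i in counts.values())
    return total ** (1.0 / (1.0 - q))


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def fitted_dq(q, a=1.0, n_tot=10_000, k_min=2, k_max=12):
    """Slope of log2 Y(N) versus log2 N for x_n = 1/n^a."""
    points = [n ** (-a) for n in range(1, n_tot + 1)]
    ks = list(range(k_min, k_max + 1))
    log_y = [math.log2(renyi_Y(points, 2 ** k, q)) for k in ks]
    return fit_slope(ks, log_y)


for q in (-3, -2, -1):
    assumed = 1.0 / ((1 - q) * 2.0)  # assumed estimate for a = 1
    print(f"q = {q}: fitted d_q = {fitted_dq(q):.3f}, assumed estimate = {assumed:.3f}")
```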



It has been proven that for fractal sets the Renyi exponents $d_q$ satisfy the HP inequality [10]

$$ d_q \le d_{q'} \quad \text{for} \quad q > q' . \qquad (12) $$

However, in our case the calculated scaling exponents apparently violate (12), as can be seen
from (11) and from Figs. 2(A) and 3 (full circles and the dashed line).
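
In practice, the HP inequality can be tested directly on a table of fitted exponents. The
sketch below reports every pair of neighbouring $q$ values for which $d_q$ increases with $q$. The
helper name `hp_violations` is hypothetical, and the sample values assume that the standard BC
estimate for $q \le 0$ has the form $d_q \simeq 1/[(1-q)(1+a)]$.

```python
def hp_violations(exponents):
    """Return pairs (q_low, q_high), q_high > q_low, with d[q_high] > d[q_low].

    `exponents` maps q to d_q. The HP inequality requires d_q to be
    non-increasing in q, so any returned pair signals a violation.
    """
    qs = sorted(exponents)
    return [
        (q_low, q_high)
        for q_low, q_high in zip(qs, qs[1:])
        if exponents[q_high] > exponents[q_low]
    ]


# Assumed q <= 0 estimate for the standard BC method on x_n = 1/n^a.
a = 1.0
standard_bc = {q: 1.0 / ((1 - q) * (1 + a)) for q in (-3, -2, -1, 0)}
print(hp_violations(standard_bc))

# A set with a single, q-independent exponent passes the check.
print(hp_violations({q: 0.63 for q in (-3, -2, -1, 0)}))
```

For numerically fitted exponents one would compare the differences with the fit uncertainty
before calling a pair a violation.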

Let us notice that when the $d_q$'s are calculated analytically for well defined fractal sets,
like the triadic Cantor set, the resolution for counting data points increases as the number of
bins increases, and the resolution at a given step (for a given partition) is equal to the bin
size (the smallest void intervals are of the bin size). This is in contrast to the standard version
of the BC method, where the data set is fixed during the whole procedure. Now, let us
modify the BC method by taking into account, for a given partition, only those points that
are separated from each other by at least the (current) bin size (i.e. the bin size fixes the
resolution). This makes the computation more involved and time consuming but, in effect,
one can recover the HP inequality.

For the modified BC method, in a way similar to the estimate (11), one obtains
the following analytical formula

$$ d_q \simeq \frac{1}{1-q} \left[ \frac{1}{1+a} - \frac{q}{a} \right] \qquad (q < 0) . \qquad (13) $$



In addition, one has $d_q \to 0$ for $q \to +\infty$. Clearly (13) satisfies (12). This result can be
validated numerically, as displayed in Fig. 3 (full squares and the dotted line for eq. (13)).
Again, we have very good scaling and a good linear fit. For positive $q$ this method gives $d_q$ that
tends to zero quite fast (while it reaches one, a bit more slowly, for the standard BC algorithm).
Hence, for pseudofractals the standard and modified BC algorithms give different results
due to the ambiguity mentioned earlier. However, in both cases good scaling and a good linear
fit are obtained.

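Our conclusions remain unchanged for sets with an arbitrary number of accumulation
points. In particular, the union of two sets with scaling exponents $d_q^{(1)}$ and $d_q^{(2)}$
gives, in the limit of a large number of bins ($N$),

$$ d_q = \lim_{N\to\infty} \frac{\ln\left( N^{d_q^{(1)}} + N^{d_q^{(2)}} \right)}{\ln N} = \max\left\{ d_q^{(1)}, d_q^{(2)} \right\} + \lim_{N\to\infty} \frac{\ln\left( 1 + N^{-|d_q^{(1)} - d_q^{(2)}|} \right)}{\ln N} = \max\left\{ d_q^{(1)}, d_q^{(2)} \right\} . \qquad (14) $$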



As for regular fractals, the scaling exponent of the union is equal to the maximal exponent
of the two sets. However, notice that convergence to this result (in the large $N$ limit) is
slowest (logarithmic) for $d_q^{(1)} = d_q^{(2)}$. As the number of non empty bins is greater than for
a single set, points in the log-log plot will be higher for not too large $N$. Hence, the whole plot
will be a bit less steep, and for data sets that are not large enough this can lead to lower
estimates of the Renyi exponents and a worse linear fit. Details of this effect depend on the
particular distribution of both sets in the embedding interval and have been verified
numerically. For sets that are not large enough the scaling can be completely lost.
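
The finite-$N$ behaviour of the union can be illustrated with a few lines of Python. The sketch
assumes that the two sets occupy different bins, so that the scaling quantity of the union is
simply $N^{d^{(1)}} + N^{d^{(2)}}$; the function names are hypothetical. The first function gives
the apparent exponent $\ln(N^{d^{(1)}} + N^{d^{(2)}})/\ln N$, the second the local slope of the
log-log plot, which is a weighted mean of the two exponents and therefore lies below the
larger one.

```python
import math


def union_exponent(d1, d2, n_boxes):
    """ln(N^d1 + N^d2) / ln N: apparent exponent of the union at finite N."""
    d_max, d_min = max(d1, d2), min(d1, d2)
    # Factor out N^d_max so that the correction term is explicit.
    correction = math.log1p(n_boxes ** (d_min - d_max)) / math.log(n_boxes)
    return d_max + correction


def local_slope(d1, d2, n_boxes):
    """d ln(N^d1 + N^d2) / d ln N: a weighted mean of d1 and d2."""
    w1, w2 = n_boxes ** d1, n_boxes ** d2
    return (d1 * w1 + d2 * w2) / (w1 + w2)


for k in (10, 20, 30):
    n = 2 ** k
    print(k, union_exponent(0.5, 0.5, n), union_exponent(0.5, 0.25, n),
          local_slope(0.5, 0.25, n))
```

For equal exponents the correction is $\ln 2 / \ln N$, which decays only logarithmically,
in line with the remark above.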

IV. PSEUDOFRACTALS AND FAT TAILS 

One is often interested in the probability distribution for a large series of data. In particular,
in recent years there has been great interest in the so-called Pareto or fat tails [?], where
histograms built out of the data have inverse power law tails

$$ P(x) \sim 1/x^{\beta} , \qquad (15) $$

$P(x)$ being the probability distribution. Here we show that non trivial ($0 < d_0 < 1$)
pseudofractals do have this property.



As the time ordering of the data points can be neglected for the histogram, let us consider
for simplicity a monotonic series $\{x_n : x_{n+1} < x_n\}$. To satisfy (15) the number of data points
($\Delta n$) in the interval $[x_{n+\Delta n}, x_n]$ must be

$$ \Delta n = \int_{x_{n+\Delta n}}^{x_n} P(x)\, dx = \frac{C}{\beta-1} \left[ \frac{1}{x_{n+\Delta n}^{\beta-1}} - \frac{1}{x_n^{\beta-1}} \right] , $$

where $C > 0$ is a normalization constant. Substituting $C_1 = (\beta-1)/C > 0$ and
$f(n) = 1/x_n^{\beta-1}$, this yields the following simple linear first order difference equation

$$ C_1 \Delta n = f(n + \Delta n) - f(n) , $$

with the general solution $f(n) = C_1 n + C_2$ or, equivalently,

$$ x_n = \frac{1}{\left[ C_1 n + C_2 \right]^{\frac{1}{\beta-1}}} . $$

For the tails ($n \gg C_2/C_1$, but still far from the accumulation point) the constant $C_2$ can be
neglected and finally we have the asymptotic behaviour

$$ x_n \sim \frac{1}{n^{\frac{1}{\beta-1}}} \equiv \frac{1}{n^{a}} . \qquad (16) $$

Hence, the tail exponent $\beta$ can be expressed in terms of $a$ or $d_0$ as

$$ \beta = \frac{1+a}{a} = \frac{1}{1-d_0} . \qquad (17) $$

The above formula displays a simple relation between the pseudofractal's parameter $a$, the tail
index $\beta$ and the box counting exponent $d_0$.
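
The relation between the series parameter and the tail index can be checked on a histogram
with logarithmic bins. The sketch below assumes the relation in the form
$\beta = (1+a)/a = 1/(1-d_0)$, i.e. a density $P(x) \sim x^{-\beta}$, and estimates $\beta$ as minus
the slope of the log density against $\log x$. The function names and the base 2 binning are
choices made for this illustration.

```python
import math
from collections import Counter


def tail_index_from_a(a):
    """beta = (1 + a) / a, equivalently 1 / (1 - d0) with d0 = 1 / (1 + a)."""
    return (1.0 + a) / a


def fitted_tail_index(points):
    """Estimate beta from a histogram with logarithmic (base 2) bins.

    Bin k collects points in (2^-(k+1), 2^-k]; density = count / bin width.
    The bin with the smallest x is dropped because it may be incomplete.
    """
    counts = Counter(int(math.floor(-math.log2(x))) for x in points)
    ks = sorted(counts)[:-1]
    log_x = [-float(k) for k in ks]
    # Bin width is 2^-(k+1), so log2(density) = log2(count) + (k + 1).
    log_density = [math.log2(counts[k]) + (k + 1) for k in ks]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_density) / len(log_density)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_density))
    den = sum((x - mean_x) ** 2 for x in log_x)
    return -num / den


harmonic = [1.0 / n for n in range(1, 2 ** 14)]
print(fitted_tail_index(harmonic), tail_index_from_a(1.0))
```

For the harmonic series the bins contain exactly $2^k$ points, so the fit is free of
statistical noise; for other values of $a$ the bin counts fluctuate and the agreement is only
approximate.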



V. CONCLUSIONS 

In this paper, we investigate general sets with accumulation points that are not fractals,
though they display fractal-like scaling behaviour. The scaling exponent $d_0$ (eq. (3)), as
obtained by the BC method, is given by (10). Furthermore, we have found the analytical
formula for $d_q$ (for $q < 0$) for the inverse power series as given by the standard BC algorithm
(eq. (11)), which fits the numerical results perfectly (Fig. 2(A)). The obtained exponents violate
the HP inequality, which can be viewed as an indicator of pseudofractal behaviour.

Similar results are obtained for the modified BC algorithm (where the number of data
points taken into account increases with increasing resolution), but in this case the HP
inequality is preserved (see eq. (13) and Fig. 2(B)). Hence, the two schemes give different
$d_q$'s for pseudofractal sets. Our results remain valid for sets with an arbitrary number of
accumulation points, where the overall scaling exponent is equal to the maximal exponent
of the constituent sets. Also, in this case one can observe a worsening of the linear fit.

In general, from the point of view of the fractal properties and the BC methods there
are four types of sets:

(i) mathematical fractals - sets that are well defined and whose fractal properties can be
rigorously proven (i.e. without numerical approximations), like the triadic Cantor set.
(ii) physical fractals - finite sets that are (computer) representations of mathematical
fractals. In this case one gets good scaling and a good linear fit with the BC method, the HP
inequality holds and both the BC and the modified BC method (described in Sec. III) give the
same results.
(iii) pseudofractals - finite sets that are not finite representations of mathematical fractals,
though they show good scaling and a good linear fit with the BC method. The resulting exponents
violate the HP inequality and the BC and modified BC algorithms give different values for
the $d_q$'s. The general formula for $d_0$ in the case when the asymptotics of $x_n$ is known is
given by (10).
(iv) non fractals, i.e. sets for which the BC algorithm does not exhibit any scaling.

The sets of types (i) and (iv) can be easily distinguished. However, it is quite non trivial
to distinguish between sets of type (ii) and type (iii). Here, one cannot apply the rigorous
mathematical machinery, as the whole set is usually unknown. In these cases numerical
methods lead to nice scaling, making the two types impossible to tell apart. In this context,
violation of the HP inequality appears to be a simple and useful indicator, in addition to the
different results obtained by the standard and modified BC algorithms.

Different classes of non trivial ($0 < d_0 < 1$) pseudofractals have scaling properties
equivalent to the series $\{x_n = 1/n^a\}$. In particular, for $q > 0$ the BC method gives $d_q$ close
to the embedding dimension, while for the modified BC algorithm $d_q$ approaches zero. For $q < 0$
analytical formulae for $d_q$ are given by (11) and (13), respectively.

Finally, as shown by (17), the parameter $a$ of the pseudofractal series is simply related
to the tail index $\beta$, as well as to the box counting Renyi exponent $d_0$. This means that
histograms made of non trivial pseudofractal sets have Pareto (fat) tails. This relation is
another signal of possible pseudofractality.

FIG. 1. Log-log plots and analytical predictions (lines) for $d_0$ with: $x_n = 1/n$ (full circles and
solid line), $1/n^{1/2}$ (squares and dashed line), $1/n^2$ (diamonds and dashed-dotted line), $n^{1/2}$
and $n^2$ (crosses and circles with one dotted line for both).

FIG. 2. Log-log plots for the harmonic series ($x_n = 1/n$) with $q = 0$ (crosses), 0.25 (full circles),
0.5 (squares), 0.75 (diamonds) and 1 (circles) with corresponding linear fits (solid lines). The upper
panel (A) is for $10^4$ data points ($n_{tot} = \mathrm{const.}$) and the standard BC method. The lower
panel (B) is for the modified BC algorithm.

FIG. 3. $d(q) \equiv d_q$ computed for the harmonic series ($x_n = 1/n$) with $10^4$ data points for the BC
(full circles) and modified BC (full squares) methods. Dashed and dotted lines represent the analytic
estimates (11) and (13), respectively.